Abstract: Mining high utility item sets from large databases refers to the discovery of item sets with high utility, such as profit. Although a number of relevant approaches have been proposed in recent years, they suffer from generating a large number of candidate item sets for high utility item sets (HUIs). So many candidates degrade mining performance in terms of both execution time and space requirements. The situation becomes worse when the database contains many long transactions or long HUIs. Utility mining offers a framework for addressing these problems. In utility mining, each item is associated with a utility (e.g. unit profit) and an occurrence count in each transaction (e.g. quantity). The utility of an item set represents its importance, which can be measured in terms of weight, value, quantity or other information, depending on the user's specification. This paper uses the Modified LP-Tree (Modified Linear Prefix Tree) algorithm for mining high utility item sets, together with a set of techniques for pruning candidate item sets. Information about high utility item sets is maintained in a special data structure, the Modified LP-Tree, so that candidate item sets can be generated efficiently with only two scans of the database. This method not only reduces the number of candidates effectively but also substantially outperforms other algorithms in terms of execution time, especially when the database contains many long transactions.
Keywords: Data mining, high utility item sets, Modified LP-Tree algorithm.
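To make the utility definition in the abstract concrete, the following minimal sketch computes the utility of an item set as the sum, over the transactions that contain it, of unit profit times quantity, and keeps the item sets that reach a user-specified minimum utility. It uses naive enumeration of all item sets and is not the Modified LP-Tree algorithm; the names PROFIT, DB, itemset_utility and naive_high_utility_itemsets, as well as the sample data, are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical unit profit of each item (assumption, for illustration only).
PROFIT = {"a": 5, "b": 2, "c": 1}

# Hypothetical database: each transaction maps an item to its quantity.
DB = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"b": 1, "c": 4},
    {"a": 1, "b": 1, "c": 1},
]


def itemset_utility(itemset, db, profit):
    """Sum of profit * quantity over the transactions containing every item."""
    total = 0
    for transaction in db:
        if all(item in transaction for item in itemset):
            total += sum(profit[item] * transaction[item] for item in itemset)
    return total


def naive_high_utility_itemsets(db, profit, min_utility):
    """Enumerate every item set and keep those with utility >= min_utility.

    This brute-force search is exponential in the number of items; it only
    illustrates the definition and performs no candidate pruning.
    """
    items = sorted({item for transaction in db for item in transaction})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, db, profit)
            if utility >= min_utility:
                result[itemset] = utility
    return result


if __name__ == "__main__":
    for itemset, utility in naive_high_utility_itemsets(DB, PROFIT, 15).items():
        print(itemset, utility)
```

With a minimum utility of 15 on this sample data, the item sets (a), (a, b) and (a, c) qualify with utilities 20, 16 and 19. Note that (a, b, c) has utility 8 even though its subset (a, b) has utility 16: utility is not monotone under set inclusion, which is why candidate pruning needs dedicated techniques rather than a simple support-style cutoff.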